{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "nteract": {
          "transient": {
            "deleting": false
          }
        }
      },
      "source": [
        "Copyright (c) Microsoft Corporation. Licensed under the MIT license.\n",
        "\n",
        "# Train and Deploy a model using Feast\n",
        "\n",
        "In this notebook we show how to:\n",
        "\n",
        "1. access a feature store \n",
        "1. discover features in the feature store\n",
        "1. train a model using the offline store (using the feast function `get_historical_features()`)\n",
        "1. use the feast `materialize()` function to push features from the offline store to an online store (redis)\n",
        "1. Deploy the model to an Azure ML endpoint where the features are consumed from the online store (feast function `get_online_features()`)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Connect to Feature store\n",
        "\n",
        "Below we create a Feast repository config, which accesses the registry.db file and also provides the credentials to the offline and online storage. These credentials are done via the Azure Keyvault."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": true,
        "gather": {
          "logged": 1627130565121
        },
        "jupyter": {
          "outputs_hidden": false,
          "source_hidden": false
        },
        "nteract": {
          "transient": {
            "deleting": false
          }
        }
      },
      "outputs": [],
      "source": [
        "import os\n",
        "from feast import FeatureStore\n",
        "from azureml.core import Workspace\n",
        "\n",
        "# access key vault to get secrets\n",
        "ws = Workspace.from_config()\n",
        "kv = ws.get_default_keyvault()\n",
        "os.environ['REGISTRY_PATH']=kv.get_secret(\"FEAST-REGISTRY-PATH\")\n",
        "os.environ['SQL_CONN']=kv.get_secret(\"FEAST-OFFLINE-STORE-CONN\")\n",
        "os.environ['REDIS_CONN']=kv.get_secret(\"FEAST-ONLINE-STORE-CONN\")\n",
        "\n",
        "# connect to feature store\n",
        "fs = FeatureStore(\"./feature_repo\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### List the feature views\n",
        "\n",
        "Below lists the registered feature views."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "fs.list_feature_views()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "collapsed": true,
        "gather": {
          "logged": 1627130724228
        },
        "jupyter": {
          "outputs_hidden": false,
          "source_hidden": false
        },
        "nteract": {
          "transient": {
            "deleting": false
          }
        }
      },
      "source": [
        "## Load features into a pandas dataframe\n",
        "\n",
        "Below you load the features from the feature store into a pandas data frame."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": true,
        "gather": {
          "logged": 1626933777036
        },
        "jupyter": {
          "outputs_hidden": false,
          "source_hidden": false
        },
        "nteract": {
          "transient": {
            "deleting": false
          }
        }
      },
      "outputs": [],
      "source": [
        "sql_job = fs.get_historical_features(\n",
        "    entity_df=\"SELECT * FROM orders\",\n",
        "    features=[\n",
        "        \"driver_stats:conv_rate\",\n",
        "        \"driver_stats:acc_rate\",\n",
        "        \"driver_stats:avg_daily_trips\",\n",
        "        \"customer_profile:current_balance\",\n",
        "        \"customer_profile:avg_passenger_count\",\n",
        "        \"customer_profile:lifetime_trip_count\",\n",
        "    ],\n",
        ")\n",
        "\n",
        "training_df = sql_job.to_df()\n",
        "training_df.head()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "nteract": {
          "transient": {
            "deleting": false
          }
        }
      },
      "source": [
        "## Train a model and capture metrics with MLFlow"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import mlflow\n",
        "import numpy as np\n",
        "from sklearn.model_selection import train_test_split\n",
        "from sklearn.ensemble import RandomForestClassifier\n",
        "from azureml.core import Workspace\n",
        "\n",
        "# connect to your workspace\n",
        "ws = Workspace.from_config()\n",
        "\n",
        "# create experiment and start logging to a new run in the experiment\n",
        "experiment_name = \"order_model\"\n",
        "\n",
        "# set up MLflow to track the metrics\n",
        "mlflow.set_tracking_uri(ws.get_mlflow_tracking_uri())\n",
        "mlflow.set_experiment(experiment_name)\n",
        "mlflow.sklearn.autolog()\n",
        "\n",
        "training_df = training_df.dropna()\n",
        "X = training_df[['conv_rate', 'acc_rate', 'avg_daily_trips', \n",
        "        'current_balance', 'avg_passenger_count','lifetime_trip_count' ]].dropna()\n",
        "y = training_df['order_is_success']\n",
        "\n",
        "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)\n",
        "clf = RandomForestClassifier(n_estimators=10)\n",
        "\n",
        "# train the model\n",
        "with mlflow.start_run() as run:\n",
        "    clf.fit(X_train, y_train)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Prepare for deployment\n",
        "\n",
        "### Register model and the feature registry "
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# register the model\n",
        "model_uri = \"runs:/{}/model\".format(run.info.run_id)\n",
        "model = mlflow.register_model(model_uri, \"order_model\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### `materialize()` data into the online store (redis)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "from datetime import datetime, timedelta\n",
        "\n",
        "end_date = datetime.now()\n",
        "start_date = end_date - timedelta(days=365)\n",
        "fs.materialize(start_date=start_date, end_date=end_date)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Set up deployment configuration\n",
        "\n",
        "__Note: You will need to set up a service principal (SP) and add that SP to your blob storage account as a *Storage Blob Data Contributor* role to authenticate to the storage containing the feast registry file.__\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "`az ad sp create-for-rbac -n $sp_name --role \"Storage Blob Data Contributor\" \\\n",
        "--scopes /subscriptions/$sub_id/resourceGroups/$rg_name`"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Once you have set up the SP, populate the `AZURE_CLIENT_ID`, `AZURE_TENANT_ID`, `AZURE_CLIENT_SECRET` environment variables below."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "from azureml.core.environment import Environment\n",
        "from azureml.core.webservice import AciWebservice\n",
        "from azureml.core import Workspace\n",
        "\n",
        "ws = Workspace.from_config()\n",
        "keyvault = ws.get_default_keyvault()\n",
        "\n",
        "# create deployment config i.e. compute resources\n",
        "aciconfig = AciWebservice.deploy_configuration(\n",
        "    cpu_cores=1,\n",
        "    memory_gb=1,\n",
        "    description=\"orders service using feast\",\n",
        ")\n",
        "\n",
        "# get registered environment\n",
        "env = Environment(\"feast-env\")\n",
        "env.docker.base_image = None\n",
        "env.docker.base_dockerfile = \"./inference.dockerfile\"\n",
        "env.python.user_managed_dependencies = True\n",
        "env.inferencing_stack_version = 'latest'\n",
        "env.python.interpreter_path = \"/azureml-envs/feast/bin/python\"\n",
        "\n",
        "# again ensure that the scoring environment has access to the registry file\n",
        "env.environment_variables = {\n",
        "    \"FEAST_SQL_CONN\": fs.config.offline_store.connection_string,\n",
        "    \"FEAST_REDIS_CONN\": fs.config.online_store.connection_string,\n",
        "    \"FEAST_REGISTRY_BLOB\": fs.config.registry.path,\n",
        "    \"AZURE_CLIENT_ID\": \"PROVIDE YOUR SERVICE PRINCIPLE CLIENT ID HERE\",\n",
        "    \"AZURE_TENANT_ID\": \"PROVIDE YOUR SERVICE PRINCIPLE TENANT ID HERE\",\n",
        "    \"AZURE_CLIENT_SECRET\": \"PROVIDE YOUR SERVICE PRINCIPLE CLIENT SECRET HERE\"\n",
        "}"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Deploy model\n",
        "\n",
        "Next, you deploy the model to Azure Container Instance. Please note that this may take approximately 10 minutes."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import uuid\n",
        "from azureml.core.model import InferenceConfig\n",
        "from azureml.core.environment import Environment\n",
        "from azureml.core.model import Model\n",
        "\n",
        "# get the registered model\n",
        "model = Model(ws, \"order_model\")\n",
        "\n",
        "# create an inference config i.e. the scoring script and environment\n",
        "inference_config = InferenceConfig(\n",
        "    entry_script=\"./src/score.py\", \n",
        "    environment=env, \n",
        "    source_directory=\"src\"\n",
        ")\n",
        "\n",
        "# deploy the service\n",
        "service_name = \"orders-service\" + str(uuid.uuid4())[:4]\n",
        "service = Model.deploy(\n",
        "    workspace=ws,\n",
        "    name=service_name,\n",
        "    models=[model],\n",
        "    inference_config=inference_config,\n",
        "    deployment_config=aciconfig,\n",
        ")\n",
        "\n",
        "service.wait_for_deployment(show_output=True)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Test service\n",
        "\n",
        "Below you test the service. The first score takes a while as the feast registry file is downloaded from blob. Subsequent runs will be faster as feast uses a local cache for the registry."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import json\n",
        "\n",
        "input_payload = json.dumps({\"driver\":50521, \"customer_id\":20265})\n",
        "\n",
        "service.run(input_data=input_payload)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Clean up service"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "service.delete()"
      ]
    }
  ],
  "metadata": {
    "kernel_info": {
      "name": "newenv"
    },
    "kernelspec": {
      "display_name": "Python 3.8.12 64-bit ('feast_env')",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.8.12"
    },
    "microsoft": {
      "host": {
        "AzureML": {
          "notebookHasBeenCompleted": true
        }
      }
    },
    "nteract": {
      "version": "nteract-front-end@1.0.0"
    },
    "vscode": {
      "interpreter": {
        "hash": "b81e56dd72a0de84f7bcdac7bc848ecf5d1ed9826cc75d6e0cb7b6dbe5b95a6d"
      }
    }
  },
  "nbformat": 4,
  "nbformat_minor": 2
}
